Abstract

   Peer-to-peer (P2P) traffic optimization techniques that aim at
   improving locality in the peer selection process have attracted great
   interest in the research community and have been the subject of much
   discussion.  Some of this discussion has produced controversial
   myths, some rooted in reality while others remain unfounded.  This
   document evaluates the most prominent myths attributed to P2P
   optimization techniques by referencing the most relevant study or
   studies that have addressed facts pertaining to the myth.  Using
   these studies, the authors hope to either confirm or refute each
   specific myth.

   This document is a product of the IRTF P2PRG (Peer-to-Peer Research
   Group).

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for informational purposes.

   This document is a product of the Internet Research Task Force
   (IRTF).  The IRTF publishes the results of Internet-related research
   and development activities.  These results might not be suitable for
   deployment.  This RFC represents the consensus of the Peer-to-Peer
   Research Group of the Internet Research Task Force (IRTF).
   Documents approved for publication by the IRSG are not candidates
   for any level of Internet Standard; see Section 2 of RFC 5741.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc6821.

1.  Introduction

   Peer-to-peer (P2P) applications used for file-sharing, streaming, and
   real-time communications exchange large amounts of data in
   connections established among the peers themselves and are
   responsible for a significant share of Internet traffic.  Since
   applications generally have no knowledge of the underlying network
   topology, the traffic they generate is a frequent cause of
   congestion on inter-domain links and contributes significantly to
   the rise in transit costs paid by network operators and Internet
   Service Providers (ISPs).

   One approach to reducing the congestion and transit costs caused by
   P2P applications consists of enhancing the peer selection process
   with proximity information.  This allows peers to identify the
   topologically closest instance among all the instances of the
   resource they are searching for.  Several solutions
   following such an approach have recently been proposed [Choffnes]
   [Aggarwal] [Xie], some of which are now being considered for
   standardization in the IETF [ALTO].  While this document serves to
   inform the protocol work going on in the IETF ALTO working group,
   this document does not specify a protocol of any kind, nor is this
   document a product of the IETF.
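
   As an illustration of the approach, the following Python sketch
   biases peer selection toward peers in the same Autonomous System
   (AS).  It is not taken from any of the cited proposals: the Peer
   type, the select_peers function, and the local_fraction parameter
   are hypothetical names introduced here, and AS membership merely
   stands in for whatever proximity information a real service would
   provide.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    """A candidate peer; `asn` is the Autonomous System it belongs to."""
    address: str
    asn: int


def select_peers(candidates, local_asn, wanted, local_fraction=0.8, rng=None):
    """Pick up to `wanted` peers, preferring those in the local AS.

    `local_fraction` is an assumed tuning knob: it caps the share of
    local peers so that some external connections are kept whenever
    external peers are available.
    """
    rng = rng or random.Random()
    local = [p for p in candidates if p.asn == local_asn]
    remote = [p for p in candidates if p.asn != local_asn]
    rng.shuffle(local)
    rng.shuffle(remote)

    n_local = min(len(local), int(wanted * local_fraction))
    chosen = local[:n_local]
    # Fill the remaining slots with external peers first ...
    chosen += remote[: wanted - len(chosen)]
    # ... and fall back to leftover local peers if too few exist.
    chosen += local[n_local : n_local + (wanted - len(chosen))]
    return chosen
```

   Keeping a share of external peers, rather than selecting local
   peers only, is a design choice of this sketch; it reflects the
   concern, discussed in Section 3.6, that excessive locality may
   weaken a swarm.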

   Despite extensive research based on simulations and field trials, it
   is hard to predict how the proposed solutions would perform in
   real-world systems made of millions of peers.  For this reason, the
   possible effects and side effects of optimization techniques based
   on P2P traffic localization have been a matter of frequent debate.
   This document describes some of the most interesting effects,
   referencing relevant studies that have addressed them and trying to
   determine whether and to what extent they are likely to occur.

   Each possible effect -- or myth -- is examined in three phases:

   o  Facts: in which a list of relevant data is presented, usually
      collected from simulations or field trials;

   o  Discussion: in which the reasons supporting and opposing the myth
      are discussed based on the facts previously listed;

   o  Conclusions: in which the authors try to express a reasonable
      measure of the plausibility of the myth.

      Note: Even though a myth is an unfounded or false notion, we have
      nonetheless chosen to provocatively assign a confirmation
      likelihood to each of the myths in Section 3.  This is a
      whimsical, but we believe effective, attempt that was inspired by
      the TV show "Mythbusters", wherein each myth was "busted", deemed
      "plausible", or "confirmed" by the end of the show.

   This document represents the consensus of the P2PRG.  The first
   version of this document appeared in February 2009, and there was a
   sizeable discussion on the contents of the document thereafter.  The
   document has been improved by incorporating comments from experts in
   the area of peer-to-peer networks as well as casual, but informed,
   users of such networks.  The IRTF community has helped improve the
   number of facts and quality of discussion and enhanced the
   trustworthiness of the conclusions documented.

   This document essentially represents the view of the participating
   P2PRG IRTF community between 2009 and the latter part of 2010; as
   such, it is like a snapshot: frozen in time.  While some aspects are
   confirmed with references to pertinent literature, other aspects
   reflect the state of discussions in the RG at the time of writing and
   may require further investigation beyond the publication date of this
   document.

2.  Definitions

   Terminology defined in [RFC5693] is reused here; other definitions
   should be consistent with the terminology in that document.

   Seeder:

      A peer that has a complete copy of the content it is sharing and
      still offers it for upload.  The term "seeder" is adopted from
      BitTorrent terminology and is used in this document to indicate
      upload-only peers in other kinds of P2P applications as well.

   Leecher:

      A peer that has not yet completed the download of a specific
      piece of content (but has usually already started offering for
      upload the part it possesses).  The term "leecher" is adopted
      from BitTorrent terminology and is used in this document to
      indicate peers that are both uploading and downloading in other
      kinds of P2P applications as well.

   Swarm:

      The group of peers that are uploading and/or downloading pieces of
      the same content.  The term "swarm" is commonly used in
      BitTorrent, to indicate all seeders and leechers exchanging chunks
      of a particular file; however, in this document, it is used more
      generally (e.g., in the case of P2P streaming applications) to
      refer to all peers receiving and/or transmitting the same media
      stream.

   Tit-for-Tat:

      A content exchange strategy where the amount of data sent by a
      leecher to another leecher is roughly equal to the amount of data
      received from it.  P2P applications, most notably BitTorrent,
      adopt such an approach to maximize resources shared by the users.
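
      The definition above can be sketched as a per-peer ledger that
      refuses uploads once they run too far ahead of what has been
      received.  This is a deliberately simplified illustration and
      not the algorithm of any particular client (BitTorrent, for
      instance, ranks peers by observed rate instead); the
      TitForTatLedger class and its allowance parameter are
      hypothetical.

```python
from collections import defaultdict


class TitForTatLedger:
    """Tracks bytes exchanged with each peer and limits one-sided uploads."""

    def __init__(self, allowance=256 * 1024):
        # `allowance` (an assumed value) is how far uploads may run
        # ahead of downloads, so that an exchange can start at all.
        self.allowance = allowance
        self.sent = defaultdict(int)
        self.received = defaultdict(int)

    def record_received(self, peer, nbytes):
        self.received[peer] += nbytes

    def record_sent(self, peer, nbytes):
        self.sent[peer] += nbytes

    def may_send(self, peer, nbytes):
        """True if sending `nbytes` keeps the exchange roughly balanced."""
        deficit = self.sent[peer] + nbytes - self.received[peer]
        return deficit <= self.allowance
```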

   Surplus Mode:

      The status of a swarm where the upload capacity exceeds the
      download demand.  A swarm in surplus mode is often referred to as
      "well seeded".

   Transit:

      The service through which a network can exchange IP packets with
      all other networks to which it is not directly connected.  The
      transit service is always regulated by a contract, according to
      which the customer (i.e., a network operator or an ISP) pays the
      transit provider per amount of data exchanged.

   Peering:

      The direct interconnection between two separate networks for the
      purpose of exchanging traffic without requiring a transit
      provider.  Peering is usually regulated by agreements taking into
      account the amount of traffic generated by each party in each
      direction.

3.  Myths, Facts, and Discussion

3.1.  Reduced Cross-Domain Traffic

   The reduction in cross-domain traffic (and thus in transit costs due
   to it) is one of the positive effects P2P traffic localization
   techniques are expected to cause, and also the main reason why ISPs
   look at them with interest.  Simulations and field tests have shown a
   reduction varying from 20% to 80%.

3.1.1.  Facts

   1.  Various simulations and initial field trials of the P4P solution
       [Xie] on average show a 70% reduction of cross-domain traffic.

   2.  Data observed in Comcast's P4P trial [RFC5632] show a 34%
       reduction of the outgoing P2P traffic and an 80% reduction of the
       incoming.

   3.  Simulations of the "oracle-based" approach [Aggarwal] proposed by
       researchers at Technische Universität Berlin (TU Berlin) show an
       increase in local exchanges from 10% in the unbiased case to
       60%-80% in the localized case.

   4.  Experiments with real BitTorrent clients and real distributions
       of peers per Autonomous System (AS) run by researchers at INRIA
       [LeBlond] have shown that ASes with 100 peers or more can save
       99.5% of cross-domain traffic with high values of locality.  They
       have also shown that on a global scale, i.e., 214,443 torrents,
       6,113,224 unique peers, and 9,605 ASes, high locality can save
       40% of global inter-AS traffic, i.e., 4.56 Petabytes (PB) out of
       11.6 PB.  This result shows that locality would be beneficial at
       the scale of the Internet.
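
       The 40% figure in fact 4 is the rounded ratio of the two
       traffic volumes quoted there.  The short Python check below
       only redoes that arithmetic; the variable names are
       illustrative.

```python
saved_pb = 4.56   # inter-AS traffic saved with high locality [LeBlond]
total_pb = 11.6   # total inter-AS traffic in the same data set

saved_share = saved_pb / total_pb
print(f"{saved_share:.1%}")  # prints 39.3%, i.e., roughly 40%
```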

3.1.2.  Discussion

   Tautologically, P2P traffic localization techniques tend to localize
   content exchanges, and thus reduce cross-domain traffic.

3.1.3.  Conclusions

   Confirmed.

3.2.  Increased Application Performance

   Ostensibly, the increase in application performance is the main
   reason for the consideration of P2P traffic localization techniques
   in academia and industry.  The expected increase depends on the
   specific application: file-sharing applications witness an increase
   in the download rate, real-time communication applications observe
   lower delay and jitter, and streaming applications can benefit from
   a high, constant bitrate.

3.2.1.  Facts

   1.  Various simulations and initial field trials of the P4P solution
       [Xie] show an average reduction of download completion times
       between 10% and 23%.

   2.  Data observed in Comcast's P4P trial [RFC5632] show an increase
       in download rates between 13% and 85%.  Interestingly, the data
       collected in the experiment also indicate that fine-grained
       localization is less effective in improving download performance
       compared to lower levels of localization.

   3.  Data collected in the Ono experiment [Choffnes] show a 31%
       average download rate improvement.

       *  In networks where the ISP provides higher bandwidth for
          in-network traffic (e.g., as in the case of a Romanian ISP
          (RDSNET), described in [Choffnes]), the increase is
          significantly higher.

       *  In networks with relatively low uplink bandwidth (as in the
          case of Easynet, described in [Choffnes]), traffic
          localization slightly degrades application performance.

   4.  Simulations of the "oracle-based" approach [Aggarwal] proposed by
       researchers at TU Berlin show a reduction in download times
       between 16% and 34%.

   5.  Simulations by Bell Labs [Seetharaman] indicate that localization
       is not as effective in all scenarios and that the user experience
       can suffer in certain locality-aware swarms based on the actual
       implementation of locality.

   6.  Experiments with real clients run by researchers at INRIA
       [LeBlond] have shown that the measured application performance is
       a function of the degree of congestion on links on which the
       locality policy tries to reduce the traffic.  Furthermore, they
       have also shown that, in the case of severe bottlenecks,
       BitTorrent with locality can be more than 200% faster than
       regular BitTorrent.

3.2.2.  Discussion

   It seems that traffic localization techniques often cause an
   improvement in application performance.  However, it must be noted
   that such beneficial effects heavily depend on the network
   infrastructures.  In some cases, for example, in networks with
   relatively low uplink bandwidth, localization seems to be useless if
   not harmful.  Also, beneficial effects depend on the swarm size;
   imposing locality when only a small set of local peers is available
   may even decrease download performance for local peers.

3.2.3.  Conclusions

   Very likely, especially for large swarms and in networks with high
   capacity.

3.3.  Increased Uplink Bandwidth Usage

   The increase in uplink bandwidth usage would be a negative effect,
   especially in environments where the access network is based on
   technologies providing asymmetric upstream/downstream bandwidth
   (e.g., DSL or Data Over Cable Service Interface Specification
   (DOCSIS)).

3.3.1.  Facts

   1.  Data observed in Comcast's P4P trial [RFC5632] show no increase
       in the uplink traffic.

3.3.2.  Discussion

   Mathematically, average uplink traffic remains the same as long as
   the swarm is not in surplus mode.  However, in some particular cases
   where surplus capacity is available, localization may lead to local
   low-bandwidth leechers connecting to each other instead of trying the
   external seeders.  Even though such a phenomenon has not been
   observed in simulations and field trials, it could occur in
   applications that use localization as the only means of optimization
   when some content becomes popular in different areas at different
   times (as is the case with prime-time TV shows distributed on
   BitTorrent networks minutes after airing in North America).
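
   The first sentence of this discussion can be made concrete with a
   very simple fluid model, which is an assumption of this sketch and
   is not taken from the cited studies: every byte downloaded is a
   byte uploaded by some peer, so the aggregate upload rate of a swarm
   is bounded by both its total upload capacity and its total download
   demand.  The function names below are hypothetical.

```python
def aggregate_upload(upload_capacities, download_demands):
    """Total upload rate of a swarm under the assumed fluid model.

    Outside surplus mode, demand is at least as large as capacity, so
    all upload capacity is used no matter which peers are paired.
    """
    return min(sum(upload_capacities), sum(download_demands))


def in_surplus_mode(upload_capacities, download_demands):
    """Surplus mode: upload capacity exceeds download demand."""
    return sum(upload_capacities) > sum(download_demands)


# Not in surplus: the whole capacity (13 units) is used, so peer
# selection cannot change the average uplink traffic.
busy = aggregate_upload([1, 1, 1, 10], [8, 8, 8, 0])

# In surplus: only 8 of 102 units are needed, so peer selection
# decides which peers supply them.
idle = aggregate_upload([1, 1, 100], [4, 4, 0])
```

   In the surplus case, a locality-only policy could place that load
   on the two low-capacity local peers instead of the external seeder,
   which is the situation described above.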

3.3.3.  Conclusions

   Unlikely.

3.4.  Impacts on Peering Agreements

   Peering agreements are usually established on a reciprocity basis,
   assuming that the amount of data sent and received by each party is
   roughly the same (or, in case of asymmetric traffic volumes, a
   compensation fee is paid by the party that would otherwise obtain the
   most gain).  P2P traffic localization techniques aim at reducing
   cross-domain traffic and thus might also impact peering agreements.

3.4.1.  Facts

   No significant publications, simulations, or trials have tried to
   understand how traffic localization techniques can influence factors
   that rule how peering agreements are established and maintained.

3.4.2.  Discussion

   This is a key topic for network operators and ISPs, and it certainly
   deserves a more careful analysis.  Some preliminary considerations
   follow.

   It seems reasonable to expect different effects depending on the
   kinds of agreements.  For example:

   o  ISPs with different customer bases: the ISP whose customers
      generate more P2P traffic can achieve a greater reduction of
      cross-domain traffic and thus could probably be in a position to
      renegotiate the contract ruling the peering agreement;

   o  ISPs with similar customer bases:

      *  ISPs with different access technologies: customers of the ISP
         that provides higher bandwidth -- and, in particular, higher
         uplink bandwidth -- will have more incentives for keeping their
         P2P traffic local.  Consequently, the ISP with a better
         infrastructure will be able to achieve a greater reduction in
         cross-domain traffic and will probably be in a position to
         renegotiate the peering agreement;

      *  ISPs with similar access technologies: both ISPs would achieve
         roughly the same reduction in cross-domain traffic; thus, the
         conditions under which the peering agreement had been
         established would not change much.

   As a consequence of the reasoning above, it seems sensible to expect
   that the simple fact that one ISP starts localizing its P2P traffic
   will be a strong incentive for the ISPs it peers with to do that as
   well.

   It's worth noting that traffic manipulation techniques have been
   reportedly used by ISPs to obtain peering agreements [Norton] with
   larger ISPs.  One of the most used techniques involves injecting
   forged traffic into the target ISP's network, in order to increase
   its transit costs.  Such a technique aims at increasing the relevance
   of the source ISP in the target's transit bill, thus motivating the
   latter to sign a peering agreement.  However, traffic injection is
   exclusively unidirectional and easy to detect.  On the other hand, if
   a localization-like service were used to direct P2P requests toward
   the target network, the resulting traffic would appear fully
   legitimate and, since in popular applications that follow the
   tit-for-tat approach peers tend to upload to the peers they download
   from, in many cases also bidirectional.

3.4.3.  Conclusions

   Likely.

3.5.  Impacts on Transit

   One of the main goals of P2P traffic localization techniques is to
   allow ISPs to keep local a part of the traffic generated by their
   customers and thus save on transit costs.  However, similar
   techniques based on de-localization rather than localization may be
   used by those ISPs that are also transit providers to artificially
   increase the amount of data exchanged with networks to which they
   provide transit (i.e., pushing the peers run by their customers to
   establish connections with peers in the networks that pay them for
   transit).

3.5.1.  Facts

   No significant publications, simulations, or trials have tried to
   study the effects of traffic localization techniques on the dynamics
   of transit provision economics.

3.5.2.  Discussion

   It is actually very hard to predict how the economics of transit
   provision would be affected by the tricks some transit providers
   could play on their customers making use of P2P traffic localization
   -- or, in this particular case, de-localization -- techniques.  This
   is also a key topic for ISPs and definitely deserves careful
   investigation.

   Probably, the only lesson transit and peering agreements have taught
   us so far [CogentVsAOL] [SprintVsCogent] is that, at the end of the
   day, no economic factor, no matter how relevant it is, is able to
   isolate different networks from each other.

3.5.3.  Conclusions

   Likely.

3.6.  Swarm Weakening

   Peer selection techniques based on locality information are certainly
   beneficial in areas where the density of peers is high enough, but
   may cause damage otherwise.  Some studies have tried to understand
   to what extent locality can be pushed without damaging peers in
   isolated parts of the network.

3.6.1.  Facts

   1.  Experiments with real BitTorrent clients run by researchers at
       INRIA [LeBlond] have shown that, in BitTorrent, even when peer
       selection is heavily based on locality, swarms do not get
       damaged.

   2.  Simulations by Bell Labs [Seetharaman] indicate that the user
       experience can suffer in certain locality-aware swarms based on
       the actual implementation of locality.

3.6.2.  Discussion

   It seems reasonable to expect that excessive traffic localization
   will cause some degree of deterioration in P2P swarms based on the
   tit-for-tat approach, and the damage from such deterioration will
   most likely affect users in networks with a lower density of peers.
   However, while [LeBlond] shows that BitTorrent is extremely robust,
   the level of tolerance to locality for different P2P algorithms
   should be evaluated on a case-by-case basis.

3.6.3.  Conclusions

   Plausible, in some circumstances.

3.7.  Improved P2P Caching

   P2P caching has been proposed as a possible solution to reduce cross-
   domain as well as uplink P2P traffic.  Since the cache servers
   ultimately act as seeders, a cache-aware localization service would
   allow a seamless integration of a caching infrastructure with P2P
   applications [EDGE-CACHES].

3.7.1.  Facts

   1.  A traffic analysis performed in a major Israeli ISP [Leibowitz]
       has shown that P2P traffic has a theoretical caching potential of
       67% byte-hit-rate.
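
       The byte-hit-rate mentioned in fact 1 is the fraction of
       requested bytes that a cache can serve.  A minimal Python
       sketch of the metric follows; the byte_hit_rate function and
       its input format are hypothetical and do not reproduce the
       methodology of [Leibowitz].

```python
def byte_hit_rate(requests, cached_ids):
    """Fraction of requested bytes that could be served from the cache.

    `requests` is an iterable of (content_id, size_in_bytes) pairs and
    `cached_ids` is the set of content identifiers held by the cache.
    """
    total = hit = 0
    for content_id, size in requests:
        total += size
        if content_id in cached_ids:
            hit += size
    return hit / total if total else 0.0
```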

3.7.2.  Discussion

   P2P caching may be beneficial for both ISPs and network users, and
   locality-based optimizations may help the ISP to direct the peers
   towards caches.  However, it is hard to tell at this point whether
   the positive effects of localization will make caching superfluous
   or not economically justifiable for the ISP.

3.7.3.  Conclusions

   Plausible, if cost-effective for the ISP.

4.  Security Considerations

   This document is a compendium of observed issues in peer-to-peer
   networks with an informed look at whether the issue is known to
   actually exist in the network or whether the issue is, well, a non-
   issue.  As such, this document does not introduce any new security
   considerations in peer-to-peer networks.

   | Uplink Bandwidth     |       | X         |            |           |